A Modular k-Nearest Neighbor Classification Method for Massively Parallel Text Categorization

Authors

  • Hai Zhao
  • Bao-Liang Lu
Abstract

This paper presents a Min-Max modular k-nearest neighbor (M-k-NN) classification method for massively parallel text categorization. The basic idea is to decompose a large-scale text categorization problem into a number of smaller two-class subproblems, train an individual modular k-NN classifier on each subproblem, and combine these classifiers into a single M-k-NN classifier. Our text categorization experiments show that M-k-NN is much faster than conventional k-NN, while its classification accuracy is slightly better. In practice, M-k-NN is closely related to the high-order k-NN algorithm; in a theoretical sense, therefore, the reliability of M-k-NN is supported to some extent.
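The abstract does not spell out how the problem is decomposed or how the module outputs are combined, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes one two-class problem per pair of classes, fixed-size chunks of each class as the smaller subproblems, a MIN over modules that share a positive chunk followed by a MAX over positive chunks (the usual min-max modular scheme), Euclidean distance, and a majority vote across class pairs. All function names (`knn_predict`, `chunks`, `min_max_two_class`, `m_knn_predict`) are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def knn_predict(train, x, k):
    """Plain k-NN vote; train is a list of (vector, label) pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def chunks(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def min_max_two_class(pos, neg, x, k, size):
    """Return 1 if x is assigned to the positive class, else 0.

    Assumption: one small k-NN module per (positive chunk, negative chunk)
    pair, combined by MIN over negative chunks and MAX over positive chunks.
    Each module is independent, so the modules could run in parallel.
    """
    return max(
        min(
            knn_predict([(v, 1) for v in p] + [(v, 0) for v in n], x, k)
            for n in chunks(neg, size)
        )
        for p in chunks(pos, size)
    )


def m_knn_predict(data, x, k=1, size=2):
    """data maps each class label to a list of vectors (assumed layout)."""
    votes = Counter()
    for a, b in combinations(sorted(data), 2):
        winner = a if min_max_two_class(data[a], data[b], x, k, size) else b
        votes[winner] += 1
    # Assumption: the final label is the class that wins the most pairs.
    return votes.most_common(1)[0][0]
```

In this sketch each module sees only `2 * size` training vectors at most, which is where the speed-up on parallel hardware would come from; the chunk size and the value of `k` are illustrative choices, not values taken from the paper.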

Similar resources

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

The Internet provides easy access to a variety of library resources. However, classifying documents within a large amount of data remains a problem, and finding specific documents demands time and energy. Grouping similar documents into specific classes can reduce the time needed to find the required data, particularly for text documents. This is further facilitated by using Artificial...

Application of k-Nearest Neighbor on Feature Projections Classifier to Text Categorization

This paper presents the results of the application of an instance-based learning algorithm, k-Nearest Neighbor Method on Feature Projections (k-NNFP), to text categorization and compares it with the k-Nearest Neighbor Classifier (k-NN). k-NNFP is similar to k-NN except that it finds the nearest neighbors according to each feature separately. Then it combines these predictions using majority voting. This proper...

Application of k-Nearest Neighbor on Feature Projections Classifier to Text

This paper presents the results of the application of an instance-based learning algorithm, k-Nearest Neighbor Method on Feature Projections (k-NNFP), to text categorization and compares it with the k-Nearest Neighbor Classifier (k-NN). k-NNFP is similar to k-NN except that it finds the nearest neighbors according to each feature separately. Then it combines these predictions using majority voting. This pr...

Neighbor-weighted K-nearest neighbor for unbalanced text corpus

Text categorization, or classification, is the automated assignment of text documents to predefined classes based on their contents. Many classification algorithms assume that the training examples are evenly distributed among the classes. However, unbalanced data sets appear in many practical applications. In order to deal with uneven text sets, we propose the neighbor-wei...

Publication date: 2004